Lexicon: A enumeration <-> string converter.#4222
Lexicon: A enumeration <-> string converter.#4222SolidWallOfCode wants to merge 2 commits intoapache:masterfrom
Conversation
57360fb to
48281f6
Compare
|
Current issue is a bug in clang interacting badly with the GCC headers, making |
| Description | ||
| =========== | ||
|
|
||
| A :class:`Lexicon` is templated by the enumeration class. It contains a set of **definitions**, each |
There was a problem hiding this comment.
I think it's more clear to say "a class template Lexicon takes an enumeration type as its only parameter".
There was a problem hiding this comment.
I changed it around to specify the type should be numeric, rather than an enumeration specifically.
| name, the primary name or any secondary name will yield the same enumeration. | ||
|
|
||
| Defaults can be set so that any name or enumeration that does not match a definition yields the | ||
| default value. The default can be explicit or it can be a handler function. The handler functions as |
| default value. The default can be explicit or it can be a handler function. The handler functions as | ||
| an internal catch for undefined conversions and is generally used to log the failure while returning | ||
| a default. It could be used to compute a default but for names, this is problematic due to memory | ||
| ownership and thread safety issues. |
There was a problem hiding this comment.
Reworded a bit, and a foot note added.
| Assume the enumeration is | ||
|
|
||
| .. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc | ||
| :lines: 30 |
There was a problem hiding this comment.
If you're going to do this, I think you need a "WARNING: Do not even BREATH on this file!" comment in test_Lexicon.cc.
There was a problem hiding this comment.
The code this depends on is marked off as being part of the documentation. I really want to be able to know the example code compiles and runs.
There was a problem hiding this comment.
The problem is for maintainers of test_Lexicon.cc to know that they need to change the documentation as well if they add or remove lines.
Maybe you should put code like this in test_Lexicon.cc: https://godbolt.org/z/b1H8tA
| Insert the element :arg:`v` into the container. | ||
| Insert the element :arg:`v` into the container. If there are already elements with the same | ||
| key, :arg:`v` is inserted after these elements. | ||
|
|
There was a problem hiding this comment.
Should we call it IntrusiveHashMultimap for consistency with STL naming?
There was a problem hiding this comment.
Different PR, but no. Functionality is important, naming less so.
| template <typename Transform> | ||
| void | ||
| ATSHash64FNV1a::update(const void *data, size_t len, Transform xfrm) | ||
| ATSHash64FNV1a::update(const void *data, size_t len, Transform xf) |
There was a problem hiding this comment.
Why in this case are we using 'Transform xf' instead of 'const Transform &xf'?
There was a problem hiding this comment.
Oversight, I think the const& is a better choice in case Transform has internal state.
| */ | ||
| template <typename E> class Lexicon | ||
| { | ||
| using self_type = Lexicon; ///< Self reference type. |
There was a problem hiding this comment.
Why use alias that's longer than what it is aliasing?
There was a problem hiding this comment.
Consistency, clarity. The point of using self_type is to be clear in declaration that a method is returning an instance of itself.
| /// Storage for names. | ||
| MemArena _arena{1024}; | ||
| /// Access by name. | ||
| IntrusiveHashMap<typename Item::NameLinkage> _by_name; |
There was a problem hiding this comment.
I think it's been shown that, for small lookup tables, an array/vector with linear search is fastest. In any case, it's simpler and any difference in speed will be small for a small lookup table.
There was a problem hiding this comment.
Yes, but I don't agree that in this case it would be simpler, in terms of the implementation, in Lexicon. The mapping also gives us nice locality of equivalent keys.
| std::string_view localize(std::string_view name); | ||
|
|
||
| /// Storage for names. | ||
| MemArena _arena{1024}; |
There was a problem hiding this comment.
Won't the string names probably be string literals? If so, it seems a shame to copy them rather than just use the passed string_view.
There was a problem hiding this comment.
Usually, but that can't be guaranteed. Because it's a MemArena the amortized cost of the copy is small and I think that's worth paying to avoid any (likely complex) rules about input string lifetimes, or restricting Lexicon to only handling constant strings.
| for (decltype(len) j = 0; j < len; ++j) { | ||
| s[j] = char_gen(randu); | ||
| } | ||
| s[len] = 0; |
There was a problem hiding this comment.
Not sure, I suspect it's a left over from a prior iteration when it used a C-string style interface instead of string_view.
|
Just out of curiosity, is there any mechanism that ensure all enum items are in the lexicon? Let's say |
|
Ah, I found |
|
But I'm still curious how we can avoid the mismatch. If it was a trafficserver/proxy/http2/Http2DebugNames.cc Lines 29 to 47 in 4405411 |
|
There isn't any way, at compile time, to verify all of the enumeration values are in the I could look at creating a constructor like and then the initialization could be In this case, if there are not exactly |
|
A possible alternative is a macro-based approach: https://wkaras.webs.com/itemlist/itemlist.html |
|
The mismatch would not be a big issue. I just wanted to know if it's possible. |
maskit
left a comment
There was a problem hiding this comment.
Although I don't understand the implementation details but I'm ok with this. Only concern I have is maintainability. The code is much longer and more complicated than the switch-case example I posted above. I understand this is more general and more powerful, but I would switch back to switch-case way if I got into a trouble with this instead of fixing it.
|
@maskit Yes, but one of the key things this does the |
ea7ff14 to
0eea3f4
Compare
|
I added and documented a mechanism to do the size verification at compile time. Not quite as clunky as directly using a |
0eea3f4 to
8f254e6
Compare
| or more contributor license agreements. See the NOTICE file | ||
| distributed with this work for additional information | ||
| regarding copyright ownership. The ASF licenses this file | ||
| regarding copyright ownership. The ASF licenses this fileN |
There was a problem hiding this comment.
That's actually in a different PR this depends on. #4223.
8f254e6 to
13f9185
Compare
|
Leaving this until more of the infrastructure is in place. |
This is a class I have intended on build for years, it is based on a class of the same name and similar function I wrote for Network Geographics. It was very handy but it required using Boost.MultiIndex which clearly wasn't going to work in the TS codebase. But now I have the infrastructure pieces needed to make the implementation simple, along with some nice C++ eleventy features. I ended up writing this now because it will be very useful in the YAML conversion of WCCP.
Dependent on #4223.
Waiting on other infrastructure PRs.